Identifying Compiler and Optimization Level in Binary Code From Multiple Architectures
Abstract
While compiling a native application, different compiler flags or optimization levels can be configured. This choice depends on the requirements. For example, if the application binary is intended for final release, the compiler and optimization settings should be set for execution speed and efficiency. Alternatively, if the application is to be used for debugging purposes, the debug flags should be configured accordingly, usually involving minor or no code optimization. However, this information cannot be easily extracted from a compiled binary. Nonetheless, ensuring the same compiler and compilation flags is particularly important when comparing binary files, to avoid inaccurate or unreliable analyses. Unfortunately, to understand which flags and optimizations have been used, a deep knowledge of the target architecture and the compiler is required. In this study, we present two deep learning models used to detect both the compiler and the optimization level in a compiled binary. The optimization levels that we study are O0, O1, O2, O3, and Os in the x86_64, AArch64, RISC-V, SPARC, PowerPC, MIPS, and ARM architectures. In addition, for the x86_64 and AArch64 architectures, we also determine whether the compiler is GCC or Clang. We created a dataset of more than 76,000 binaries and used it for training. Our experiments showed over 99.99% accuracy in detecting the compiler and between 92% and 98% accuracy, depending on the architecture, in detecting the optimization level. Furthermore, we analyzed the change in accuracy when the amount of data was extremely limited. Our study shows that it is possible to accurately detect both the compiler flag settings and the optimization level with function-level granularity.
Similar resources
Optimization and Code Generation in a Compiler for Several Machines
This paper describes optimization techniques that have been implemented in a compiler which was designed to produce code comparable to that produced by hand. Additional optimization methods were incorporated into successive versions of the compiler. It was found that no single method was effective with all compiled programs but that each of the techniques described was effective for some progr...
Compiler Technology for Migrating Sequential Code to Multi-threaded Architectures
Executing sequential code in parallel on a multithreaded machine has been an elusive goal for many years. It has recently become quite important due to the widespread introduction of multi-cores in PCs. Automatic multi-threading could not be achieved so far because classic compiler analysis was not powerful enough and program behavior was found to be in many cases input dependent. Run time, spe...
High-Level Code Optimization
Software systems are inherently complex. Building large software systems has proved so difficult precisely because of the complexity levels with which programmers have to deal. In [7] Brooks divides complexity into essential and accidental and argues that solutions which worked in other fields cannot apply to software. Essential complexity stems from the very nature of software (i.e. the large number...
Identifying Multiple Authors in a Binary Program
Knowing the authors of a binary program has significant application to forensics of malicious software (malware), software supply chain risk management, and software plagiarism detection. Existing techniques assume that a binary is written by a single author, which does not hold true in the real world because most modern software, including malware, often contains code from multiple authors. In thi...
An Instrumenting Compiler for Enforcing Confidentiality in Low-Level Code
We present an instrumenting compiler for enforcing data confidentiality in low-level applications (e.g. those written in C) in the presence of an active adversary. In our approach, the programmer marks secret data by writing lightweight annotations on top-level definitions in the source code. The compiler then uses a static flow analysis coupled with efficient runtime instrumentation, a custom ...
Journal
Journal title: IEEE Access
Year: 2021
ISSN: ['2169-3536']
DOI: https://doi.org/10.1109/access.2021.3132950